In [1]:
from __future__ import division
In [2]:
nums = [2,4,6]
[2/n for n in nums]
Out[2]:
Why did this happen?
Because of the __future__ import: / now performs true division (the Python 3.x behavior), so each 2/n comes back as a float instead of a truncated integer.
There are ways to fix this: do from __future__ import division to get the true-division behavior from Python 3.x, cast one operand to float (e.g. 2/float(n)), or use // when you actually want integer division.

Say your iSchool project partner gave you a list of stuff to do. It's in priority order already, but you want each item to be numbered in order: you know, first do ANLP reading, then do ANLP homework, then do ANLP corpus selection, and oh yeah, maybe then do something for 202 and TUI. So you start with a list like
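To make the difference concrete, here is a small sketch contrasting true division (what the __future__ import gives you, and the Python 3.x default) with floor division:

```python
from __future__ import division  # a no-op in Python 3, changes / in Python 2

nums = [2, 4, 6]

# True division: / always returns floats
true_div = [2 / n for n in nums]

# Floor division: // truncates, matching old Python 2 integer behavior
floor_div = [2 // n for n in nums]

print(true_div)   # [1.0, 0.5, 0.3333...]
print(floor_div)  # [1, 0, 0]
```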
todo = ['anlp_reading', 'anlp_homework', 'anlp_corpus', '202_reading', 'tui_homework', 'tui_project']
and you want to turn it into
[(0, 'anlp_reading'), (1, 'anlp_homework'), (2, 'anlp_corpus'), (3, '202_reading'), (4, 'tui_homework'), (5, 'tui_project')]
Below, write code for a standard way to do this, either with a for loop or a list comprehension.
In [4]:
todo = ['anlp_reading', 'anlp_homework', 'anlp_corpus', '202_reading', 'tui_homework',
        'tui_project']
i = 0
todo2 = []
for v in todo:
    todo2.append((i, v))
    i += 1
todo2
Out[4]:
Now here is the handy quick little way to do this faster: enumerate! This produces an iterator object, so to see its output all at once, wrap a list() around it, e.g.,
list(enumerate(todo))
In [5]:
list(enumerate(todo))
Out[5]:
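By the way, if you want the numbering to begin somewhere other than zero, enumerate also takes an optional start argument (a standard Python feature, not shown in the cell above):

```python
todo = ['anlp_reading', 'anlp_homework', 'anlp_corpus']

# enumerate(iterable, start) lets the count begin at any value
numbered = list(enumerate(todo, start=1))
print(numbered)  # [(1, 'anlp_reading'), (2, 'anlp_homework'), (3, 'anlp_corpus')]
```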
In English we can determine a lot of information about word forms by looking at the endings of the words. Python makes this very easy to do. For example, words that end in "ing" are often gerunds or else present participles. (The gerund has the same function as a noun but looks like a verb. The present participle is used to form progressive tenses.) Below, the code loads the text files from NLTK as described in Chapter 1.
Choose one of the texts and write one line of code that pulls out all words that end in 'ing' from that text file. (Hint: there is a special string command that does just what you want.)
In [6]:
import nltk
from nltk.book import *
gerunds = [w for w in text3 if w.endswith("ing")]
In [13]:
print(gerunds)
Look at the results and try to see what you can tell, without context, about those words -- are they nouns, gerunds, present participles, something else?
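As an aside, str.endswith also accepts a tuple of suffixes, which is handy if you want to pull out several verbal endings at once. The word list below is just an illustration, not one of the NLTK texts:

```python
words = ['running', 'walked', 'talks', 'singing', 'flies']

# endswith() with a tuple matches if the string ends with ANY of the suffixes
verbal = [w for w in words if w.endswith(('ing', 'ed'))]
print(verbal)  # ['running', 'walked', 'singing']
```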
In the homework we looked a bit at how to get rid of some of the noise from a list of frequent words. Some standard approaches are to: lowercase all the words, remove stopwords, and strip out punctuation and other non-alphabetic tokens.
These are all fine things to do. An additional idea is to compare the common words from one collection to those of another and see how they differ. Those that differ but are still very common are probably quite interesting and signify something special about that collection, especially after some simple normalization steps.
To try this out, do the following steps: pick two texts, compute the most frequent words in each (after normalizing as above), and then compare the two lists to see which common words show up in one but not the other.
In [ ]:
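The comparison just described can be sketched roughly as below. The toy word lists, the tiny stopword set, and all the variable names here are illustrative stand-ins; with NLTK you would use real texts such as text1 and text2 and a proper stopword list:

```python
from collections import Counter

# Toy stand-ins for two collections (hypothetical data, for illustration only)
coll_a = "the whale the sea the ship whale harpoon sea the whale".split()
coll_b = "the sense the manners the estate sense marriage the sense".split()

STOPWORDS = {'the', 'a', 'an', 'of', 'and'}  # a tiny illustrative stopword set

def top_words(tokens, n=3):
    """Normalize to lowercase, drop stopwords, return the n most common words."""
    counts = Counter(w.lower() for w in tokens if w.lower() not in STOPWORDS)
    return [w for w, _ in counts.most_common(n)]

top_a = top_words(coll_a)
top_b = top_words(coll_b)

# Words frequent in one collection but not the other signify something
# special about that collection
distinctive_a = [w for w in top_a if w not in top_b]
print(distinctive_a)  # ['whale', 'sea', 'ship']
```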